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A first systematic study of the effects of the choice of the input scale in global determinations 
of parton distributions and QCD parameters is presented. It is shown that, although in principle 
the results should not depend on these choices, in practice a relevant dependence develops 
as a consequence of what is called procedural bias. This uncertainty should be considered in 
addition to other theoretical and experimental errors, and a practical procedure for its estimation 
is proposed. Possible sources of mistakes in the determination of QCD parameter from parton 
distribution analysis are pointed out. 



Parton distribution functions (PDFs) describe the nucleon in high-energy collisions and, in addition to 
their intrinsic interest, play an essential role in the analysis and understanding of scattering experiments, 
for example the ones currently performed at the LHC. For this reason a considerable effort is being contin- 
uously made in order to determine the PDFs as accurately as possible. During the last few years a number 
of groups have produced publicly available PDFs using different data sets and analysis frameworks, and 
some joint initiatives aiming to improve the precision and understanding of PDF determinations in various 
aspects have also been established [1,2]. 

As is well known, the Q 2 scale 1 evolution of the PDFs f(x,Q 2 ) is calculable within perturbative QCD 
and given by the renormalization group equations (RGE). In parton distribution analyses the x-dependence 
of the PDFs is thus extracted at a particular scale Q 2 ,, usually referred to as the input scale. One aspect of 
PDF determinations which has not been systematically addressed so far is the impact of the choice of this 
input scale on the results. In the standard approach followed by most groups the input scale is fixed at one 
arbitrarily chosen value Qq > 1 GeV 2 , and the consequences of the choice of a particular value (ranging 
from 1 to 9 GeV 2 in current analyses) are ignored. 

Alternatively, the so-called dynamical parton distributions [3-6] are generated from input distributions 
at an optimally determined low input scale Qq = \l 2 < 1 GeV 2 . This has the advantage that the input 
distributions naturally tend to valence-like (positive definite) functions, i.e., not only the valence but also 
the sea and gluon input densities vanish in the small-x limit. Thus the behavior of the parton distributions 
at small x appears in the dynamical approach as a consequence of QCD dynamics, while in the standard 
approach it has to be fitted. It was shown in [5, 6], where also "standard" distributions were generated using 
Q 2 — 2 GeV 2 , that this also implies that the dynamical distributions at Q 2 > 1 GeV 2 are more restricted 
and have smaller uncertainties than their standard counterparts. Note, however, that although the dynamical 
approach provides a natural connexion with non-perturbative physics, no claim is made that the dynamical 
distributions describe the nucleon at non-perturbative scales of typically Q 2 < 1 GeV 2 ; as a matter of fact 
no comparison with data is attempted for scales lower than the kinematic cuts defined below. 



We set equal renormalization and factorization scales, as has been done in all PDF analyses so far. 



The choice of the input scale in a parton distribution analysis also affects the determination of QCD 
parameters. For example, the strong coupling constant a s (M|) is determined in [5, 6] together with the 
parton distributions and is substantially correlated with the gluon distribution which drives the QCD evolu- 
tion (consequently its uncertainty is also smaller in the dynamical case. ). At NNLO [6] we got CC S (M%) = 
0.1124 ± 0.0020 in the dynamical case, to be compared with a, (M|) = 0.1158 ± 0.0035 in the "standard" 
one. These differences should be interpreted as genuine uncertainties in the determinations; e.g., our con- 
tribution to the next PDG value [7] will be an average over dynamical and "standard" NNLO results, rather 
than a single particular value. 

The crucial observation is that once one finds an optimal solution for the PDFs (and/or QCD parameters) 
using one input scale £> 2 , it is certain that solutions exist at all other scales with equally good % 2 . As a 
matter of fact, one could find them via (forward or backward) RGE evolutions (this implies also that those 
solutions would have the same values of the QCD parameters, in particular of a 4 (M|) ). As a consequence 
of this straightforward observation one can conclude that any dependence of the results on <2„ is due to a 
certain inability of the estimation procedure to find the optimal solution. For example one can immediately 
think of shortcomings of the parametrization to reproduce the optimal shape of the distributions at the 
different input scales, i.e. parametrization bias. The observation is however much more general and applies 
to the totality of the theoretical framework, as well as to the statistical estimation; we will refer to it as 
procedural bias. 

Turning the argument around, one can use the (9» dependence of the results as an estimator of the bias 
due to the particular method used. Of course, since the true solution is not known, it is not possible in 
general to determine the absolute procedural bias. However variations with g 2 give information on the its 
relative size (i.e., with respect to the best solution found), and certainly set a lower limit on it, i.e., define a 
minimal error which, at the very least, has to be taken into account. Nevertheless, under the assumption that 
further improvements in the solution would be negligible, one could construct a measure of the procedural 
bias by quantifying variations of the results with £> 2 . 

An additional observation is that for a significant g 2 dependence to develop it is necessary to compare 
Ql values at which the shape of the distributions is quite different, e.g. a particular parametrization will 
adapt very similarly to distributions at input scales which are very close to each other. Note however that 
in the low g 2 region the parton distributions functions go through a complete rearrangement, in particular 
the gluon and sea distributions change dramatically from a valence-like structure (or even negative values) 
to a more conventional standard shape where they increase as x decreases. Thus it is possible to estimate 
the procedural bias by studying the results obtained at different input scales in these two qualitatively 
differentiated regions, as was done in [5, 6]. 

With the aim of investigating these ideas, we extend in the present paper the studies in [5, 6] to a 
systematic investigation on the effect of the input scale in the determination of parton distributions and 
QCD parameters (we will consider explicitly a s (M|) but most of the discussion applies equally to heavy 
quark masses). This studies are part of an ongoing update of the JR distributions, details on the NNLO 
framework can be found in [6], and changes for the update have been recently reported in [8]. While most 
of the changes are irrelevant for the discussion here, some of them are related to the a s (M|) values obtained 
and will be discussed below. The exact procedure for statistical estimates is also of relevance and will be 
discussed in the Appendix. 

In principle there is complete freedom for the choice of the scale at which the initial conditions for the 
RGE are taken. However, it is expected that non-perturbative dynamics become relevant and eventually 
dominant at low scales, e.g. scales of about 0.4 GeV 2 were estimated in [9] on the basis of chiral symmetry 
breaking. These are similar to (or lower than) the optimal scales considered in the dynamical determinations 
of parton distributions, which are determined on purely phenomenological grounds, i.e. the stability of the 
fits and a valence-like structure of the distributions [3-6]. We will consider here fixed scales ranging from 
0.6 GeV 2 to the 9 GeV 2 used by the ABM group [10], which is the largest one used in current fits; in 
particular we have chosen Q 2 = 0.6, 0.7, 0.8, 1, 2, 5, 9 GeV 2 . 

For the purpose of comparing results with different degrees of procedural bias we consider parametriza- 
tions with increasing flexibility, ranging from 13 to 22 free parameters in addition to a s (M|) , which is also 
taken as a free parameter in all fits. The nomenclature and precise definition of each parametrization are 
given in Table 1 . For the quark distributions u v = u — u,d v =d — d,L = u + d, and A = d — u we use the 
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Table 1: Definition of the different parametrizations used. A "yj" symbol indicates that the parameter is set free 
in the corresponding parametrization, a "-" symbol that it is set to zero. Note that the parameters N Uv ,N^ and N g 
are determined via quark-number and momentum conservation, respectively, and thus are not free. 



following parametric form: 

xf(x,Ql) =N f x a f{\ -x) b f{\ +A f Vx + B f x + CfX 2 ) (1) 

were the polynomial coefficients are progressively set free. There are no free parameters for the strange- 
quark distributions [8]. The deep-inelastic-scattering (DIS) and Drell-Yan data included in our analysis 
[6, 8] do not directly constrain the gluon distribution at high x, thus in order to increase the freedom of this 
distribution beyond the basic shape used in [6] we consider 

xg(x, Ql) - N g x"*(l- xp (l + N' g A (1 - x) 25 ) (2) 

This shape allows the input gluon to become negative in the small x region, a feature that the MSTW [11] 
group finds preferred by the HERA data if an input scale of Ql — 1 GeV 2 is used. It should be noted 
that due to the NNLO gluon-gluon splitting function, which is negative and more singular in the small-x 
region [12] than the LO and NLO ones, any NNLO gluon at low scales tend to decrease as x decrease and 
might eventually turn negative. In particular at the low scales (~ 0.6 to 1 GeV 2 ) considered here, also our 
dynamical JR09 gluon [6] exhibit this feature, even though it is generated from a definite positive input 
distribution at Q 2 a = 0.55 GeV 2 . 

In Fig. 1 we present the j 2 (over number of points for 241 1 points) values obtained at the different input 
scales for the parametrization considered in the analyses. The j 2 decrease as the number of free parameters 
increase until they stabilize at NNLO20, which indicate that the extra freedom of the second term in Eq. 2 
does not improve 2 the description of the data achieved with the simpler parametrization with = used 
in [6]. It should be emphasized that for the NNL022 fit with Q 2 () = 0.55 GeV 2 the powers in Eq. 2 turn out 
to be a g ~ 1.2 and a' ~ 1, i. e. even with this parametrization the input gluon distribution turn out to be 
valence-like as in [6]. Thus, this is a natural tendency of the input gluons at low scales and not an artifact 
of the simpler parametrization with = 0. As a matter of fact if one chooses a sufficiently low input scale 
the details of the input distribution at low x do not have much influence in the predictions at higher scales, 
say Q 2 > 2 GeV 2 , which is one of the aspects exploited in the dynamical approach to parton distributions. 

2 It might seem strange that the Rvalues for NNL022 are slightly larger than for NNLO20 at some Ql values. This has to do with 
the rescaling of the systematic errors, as explained in the Appendix. 
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Figure 1: x 2 values over number of points for 241 1 points obtained for the different parametrizations using several 
input scales 2 . 

Of more relevance for the present investigations is the fact that, as expected, the variation in ^ 2 among 
fits with different input scales also reduces as the flexibility of the parametrization increases; again the 
NNL022 version does not reduce this (already rather small) variation further. One could, of course, try to 
improve the theoretical description, by extending the parametrization or otherwise. However the suggestion 
in this article is that one can estimate the uncertainty due to the remaining procedural bias by using this 
variation. This would lead to an additional theoretical error due to procedural bias which should be added 
(as an independent source, e.g. in quadrature) to the existing experimental and theoretical errors for both 
QCD parameters and predictions based on the extracted PDFs. As a matter of fact, one could devise a 
quantitative measure of this error, say (half) the difference between the typical dynamical scale (about 0.6 
GeV 2 ) and the most different "standard" one (say 2 GeV 2 ), instead of assuming it to be exactly zero. 

Turning now to the determination of a s (M|) we present in Fig. 2 our results for the different fits. It 
can be seen that the parametrizations with less number of free parameters lead to estimations which change 
as one increases the flexibility, while the results stabilize at NNLO20; further increasing the number of 
parameters does not lead to substantially different results but generates fluctuations. Note, however, that 
in the case of a. s (M|) the variation with the input scale is not significantly reduced as the flexibility of 
the parametrization is augmented. Following the suggestion of the previous paragraph, an error of about 
0.0006 due to procedural bias would be attributed to the determinations of a s {M^) in this analysis. Note 
that this is comparable to the uncertainty stemming from the experimental uncertainties 3 of the data, which 
( depending on Q 2 ) is also of about 0.0006, as indicated (for NNLO20 only) by the band in Fig. 2. 

Although a full description of the ongoing update of our analyses will be published somewhere else 
[13], and quantitative results are not the main aim of this letter, the quoted values on a s (M|) are in need 
of some comments, in particular in relation to the results reported in [6]. In addition to a full treatment 
of the experimental correlations, which is discussed in the Appendix, changes in our analysis framework 
include the fact that for the SLAC [14] and NMC [15] data we use now the directly measured cross-sections 
(instead of the extracted structure functions) since potential problems with the extractions in [15] have been 
pointed out [18]. A wealth of deuteron data have also been included [14-17], for which description we use 
the nuclear corrections of [19]. Further theoretical improvements include the use of the running mass 



The uncertainty due to propagation of experimental errors reported here has been obtained using Ax 
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Figure 2: a s (AfJ) values obtained for the different parametrizations using several input scales. The band indicates 
the error due to propagation of the experimental uncertainties for the NNLO20 results 3 . 



definition for DIS charm and bottom production and an approximated NNLO expression [20], so that the 
semi-inclusive HERA data on heavy quark production included in our analysis up to NLO [5] are now used 
at NNLO as well. 

Another issue raised by the ABM collaboration is the necessity of including higher twist contributions 
in the description of fixed target data [10], even if moderate kinematic cuts are used to select the data 
included in the fits, in particular for SLAC and NMC data. The kinematic cuts in our G(JR) analyses [5, 6] 
were Q 2 > 4 GeV 2 in virtuality and W 2 > 10 GeV 2 in DIS invariant mass squared, and were applied to the 
Fi values extracted from different beam energies and combined. The description was good for NMC data 
and rather poor for SLAC data. However since the number of points for these experiments was rather small 
(about 100 data points for NMC and 50 for SLAC), the values of the cuts did not affect much the results in 
[5, 6]. 

This picture changes if, as the ABM collaboration does, data on the cross sections for individual en- 
ergies are used, which amounts to hundreds of data points for each experiment. Since these data are also 
used in the present analysis, the virtuality cut has been raised to Q 2 > 9 GeV 2 for the SLAC and NMC 
data; the other data sets are less sensitive to higher twist [10]. We find results which are rather similar to 
our JR09 analysis [6], with a s (M|) values about 0.113 to 0.114, depending on the input scale. If the usual 
cut of Q 2 > 4 GeV 2 is applied to these data a 4 .(Af|) rises to 0. 1 176; a tendency in agreement with the ABM 
observations [10]. The inclusion of higher twist terms in the theoretical description should strongly reduce 
the dependence of the outcome of the fits and is currently under investigation [13]. Yet another way of 
(wrongly) obtaining higher a s (M|) values is pointed out in the Appendix. 

To summarize, we have presented a (first) systematic study of the effects of the choice of the input scale 
in global determinations of parton distributions and QCD parameters. It has been shown that although in 
principle the result should not depend on these choices, in practice a relevant dependence develops as a 
consequence of what we call procedural bias. This uncertainty should be considered in addition to other 
theoretical and experimental errors, and a practical procedure for its estimation based on variations of the 
input scale has been proposed. 



5 



Acknowledgements 



We thank E. Reya for a fruitful collaboration and for carefully reading the manuscript, and S. Alekhin, 
J. Bliimlein, S. Moch, and V. Radescu for discussions. This research is supported in part by the Swiss 
National Science Foundation (SNF) under contract 200020-138206. Authored by Jefferson Science As- 
sociates, LLC under U.S. DOE Contract No. DE-AC05-06OR23177. The U.S. Government retains a 
non-exclusive, paid-up, irrevocable, world-wide license to publish or reproduce this manuscript for U.S. 
Government purposes. 

Appendix 

An improvement in our current analysis framework as compared to the published in [6] is a complete 
treatment of the systematic uncertainties of the data including experimental correlations. The least-squares 
estimator that we use to take them into account has been explicitly written down in Appendix B of [21]; 
we repeat it here for convenience of the reader, and to fix the notation for further discussion. The global 
j 2 -function is obtained as the sum of the functions of each of the data sets included in the analysis. 
These consist of i — l,...,N data points of central value D,, total uncorrected error A, (statistical and 
uncorrected systematic errors added in quadrature), and correlated systematic errors A J; for j = 1, . . . ,M 
sources. Denoting the respective theoretical predictions as 7], the j 2 function for a data set is: 

N i / M \ 2 M 

M^ A+IoVd +Eo 2 (3) 

'=l A i \ 7=1 / 7=1 

In addition to the numerical minimization which determines optimal values for the parameters on which the 
theoretical predictions depend, one could also determine the systematic shifts r ; - via numerical optimization. 
However, since the dependence on this parameters is simple, it is also possible to do it analytically [21] 

m N r> — t a'aa 

rj = -L*£ B » B , = L A AA ^ = 5| + I^ (4) 

k=l i=l i=l 

where 8j k denotes here the Kronecker delta. Note that in this expression the errors have been regarded as 
independent quantities. This is usually true for the statistical (counting) errors, while the systematic errors 
are often proportional to the central values (typical examples are acceptance corrections and luminosity 
uncertainties), which requires a careful treatment. 

Denoting [22] the relative errors as 8 ~ g (for each point and all different kind of uncertainties), one 
would naively use Aji = SjiDi, A? = 8 2 at t D 2 + 8 unc ,iD 2 in the above equations. However this is well-known 
to result in a biased estimation of the theory. First of all (even in the absence of correlations) data points 
with smaller central values wold have smaller uncertainties, leading to a bias towards small theoretical 
predictions. Moreover, when correlations are taken into account, points with larger central values would 
be shifted (for a particular rj) more than those with smaller ones, thus resulting in smaller theoretical 
predictions (following the smallest central values) and larger systematic shifts (to accommodate the points 
with larger ones). This effect is very important for normalization uncertainties, leading to shifts of several 
(negative) units and clearly biased estimators (see also [23] for some simple illustrations). 

A solution to this problem is to take the absolute systematic errors proportional to theoretical predictions 
instead of proportional to the data. Note, however, that merely setting A,-, = 5,,?], 

in Eqs. 3-4 also leads to biased results, in this case towards larger theory values since it would be possi- 
ble to decrease J 2 by merely increasing the theory estimations, thus increasing the systematic errors and 
decreasing the shifts. An unbiased solution can be found iteratively: one scales the errors as above us- 
ing a (fixed) initial theory T = T^°\ finds a first "optimal" solution and proceed to update the errors 
by rescaling them using T = as fixed theory, thus iterating the process until the result is stable, say 
— j{n-i) U p t0 tne d es i re( j precision. In practice the algorithm converges very rapidly (already at the 
second/third iteration). Note that this is the method used for the combination of inclusive HERA data [22]. 

A note of warning. If one uses this procedure for % 2 in parameter scans, i.e. to determine the j 2 profile 
by fits at fixed values of a parameter, one encounters again a bias towards larger theory estimators, for 
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Figure 3: a s (M|) scans for our NNLO20 parametrization at <2„ = 1 GeV 2 . The iterative result is compared with 
the estimation from a scan with fixed errors (obtained by using the iterative result for scaling), and with the biased 
result that would be obtained by continuously rescaling the systematic errors (biased). 

the same reasons as before (j 2 is smaller for the parameter values which produce larger systematic errors). 
However, this kind of scans can lead to a sensible (unbiased) value for the scanning parameter if a fixed 
theory is used for the rescaling; in this case the Rvalues depend on the chosen theory but are at least 
comparable. The situation has been illustrated in Fig. 3, where we present a a s (M|) scan obtained with 
a fixed theory (NNLO20 at Ql — 1 GeV 2 from the main text), and compare it with what is obtained by 
continuously rescaling the systematic errors. This could be one of the reasons for the higher a s (M|) values 
reported by some PDF groups, e.g. [24] and/or [25]. 
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